DNA sequence and amino acid sequence encoding the human rotavirus major outer capsid glycoprotein

ABSTRACT

PCT No. PCT/AU85/00096 Sec. 371 Date Feb. 4, 1987 Sec. 102(e) Date Feb. 4, 1987 PCT Filed Apr. 29, 1985 PCT Pub. No. WO85/05122 PCT Pub. Date Nov. 21, 1985. A material being a dsRNA gene segment coding for the major outer capsid glycoprotein of a rotavirus.

This application is a continuation of application Ser. No. 824,704, filed Feb. 4, 1987.

This invention relates to rotavirus, genes, gene segments, cloned genes and segments and products obtained therefrom including diagnostic reagents and vaccines.

Rotavirus is now recognized by the World Health Organization as a major cause of infantile gastroenteritis, and a high priority has been placed on control of this disease by the production of a suitable vaccine (1). Cross-neutralization tests indicate four (or possibly five) serotypes of human rotavirus (2-4), and animal studies appear to show little cross-protection between serotypes (5). Thus a potential vaccine may have to incorporate all the known human serotypes. The virus serotype has recently been shown to be determined by the major outer shell glycoprotein (6-10) (a virus surface protein), and the gene segments coding for this protein from a bovine (UK) and a simian (SA11) rotavirus have recently been sequenced (11, 12). To date, however, no such gene from human rotavirus has been analysed. We therefore cloned and sequenced the gene encoding this protein from a human rotavirus, Hu/5 (isolated in Melbourne, Australia), belonging to serotype 2.

The present invention provides a human rotavirus gene and a cloned human rotavirus gene, the use of such genes to obtain expression of antigenic viral proteins such as in bacterial/procaryotic or eucaryotic expression systems and the expression products obtained and further including vaccines and diagnostic reagents obtained therefrom.

The present invention also provides the dsRNA gene segment coding for the major outer capsid glycoprotein of a human rotavirus and, without prejudice to the generality of the foregoing, that human rotavirus may be Hu/Australia/5/77 (serotype 2), a DNA copy of same, a clone thereof, or a vector or a host cell containing same, and peptide sequences obtained therefrom. Of particular interest are vectors such as plasmids obtained therefrom and host cells containing same.

The present invention also provides a material comprising a nucleotide sequence coding for at least part of the major outer capsid glycoprotein of a rotavirus.

In one instance the present invention provides at least one of the nucleotide sequences from nucleotide numbers 291-357, 480-513 and 657-720 of a rotavirus major outer capsid glycoprotein gene.

In another instance the present invention provides at least one of the amino acid sequences from amino acid numbers 82-103, 144-155 and 204-224 for which the nucleotide sequences of a rotavirus major outer capsid glycoprotein gene code.

In a particularly preferred instance the present invention provides a material comprising a nucleotide sequence encoding, or an amino acid sequence being,

a. an amino acid sequence of 22 amino acids commencing CLYYP and terminating TLS,

b. an amino acid sequence of 12 amino acids commencing YD and terminating SEL, or

c. an amino acid sequence of 21 amino acids commencing GIGC and terminating EKL,

and derived from a nucleotide sequence coding for a major outer capsid glycoprotein of a rotavirus.
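The relationship between the amino acid numbers and the nucleotide numbers of these three regions can be sketched as follows. This is only an illustrative calculation: the helper name `residue_range_to_nucleotides` is hypothetical, and it assumes the 48-nucleotide 5' non-coding region reported later in this description, so that codon 1 occupies nucleotides 49-51. Under that assumption the end positions agree with the quoted nucleotide ranges, while the start positions come out within one or two nucleotides of them, so the quoted ranges may follow a slightly different numbering convention.

```python
# Illustrative sketch only: maps amino-acid residue ranges onto nucleotide
# coordinates (1-based, inclusive). The 48-nt offset is an assumption taken
# from the 5' non-coding length stated in the detailed description.
FIVE_PRIME_NONCODING = 48

def residue_range_to_nucleotides(first_res, last_res, offset=FIVE_PRIME_NONCODING):
    """Return (first_nt, last_nt) spanned by residues first_res..last_res."""
    first_nt = offset + 3 * (first_res - 1) + 1  # first base of the first codon
    last_nt = offset + 3 * last_res              # last base of the last codon
    return first_nt, last_nt

# Regions as numbered in the text (amino acid numbers).
REGIONS = {"A": (82, 103), "B": (144, 155), "C": (204, 224)}

for name, (first, last) in REGIONS.items():
    first_nt, last_nt = residue_range_to_nucleotides(first, last)
    n_residues = last - first + 1
    print(f"region {name}: residues {first}-{last} ({n_residues} aa) "
          f"-> nucleotides {first_nt}-{last_nt}")
```

The residue counts produced (22, 12 and 21) match the lengths given for sequences a, b and c above.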

Specific portions of cloned genes are provided by this invention and the invention extends to products obtained therefrom including antisera or antibodies prepared by utilization of such amino acid sequences.

This invention will be exemplified by the following description.

MATERIALS AND METHODS

Virus growth and purification

The human rotavirus Hu/5 (Hu/Australia/5/77) (13) was grown in MA104 cells and purified as described previously (14).

Cloning rotavirus cDNA

The procedure for producing cDNA from rotavirus dsRNA, and cloning it into the PstI site of the plasmid pBR322 has been described previously by Dyall-Smith et al. (15).

Identification of cloned copies of the major outer shell glycoprotein gene of Hu/5 rotavirus

Since the UK bovine rotavirus gene encoding the major outer shell glycoprotein (gene 8 of this virus) had previously been cloned (11), this was used to screen the Hu/5 clones. To eliminate pBR322 sequences, the UK gene 8 clone was digested with PstI and the insert separated by agarose gel electrophoresis. The insert was then ³²P-labelled by nick translation (16) and hybridized to transformant bacterial colonies which had been lysed on nitrocellulose filters (17).

Northern blot analyses

Hu/5 dsRNA was separated on a polyacrylamide gel and immobilized on aminophenylthioether (APT) paper as described previously (7), except that the RNA was loaded right along the top of the stacking gel (which was not divided into wells). After transfer, the blot was cut (lengthwise) into strips and hybridized to ³²P-labelled cDNA or nick-translated DNA probes. Labelled cDNA was prepared from Hu/5 segments 7, 8 and 9 dsRNA (isolated by agarose gel electrophoresis) using reverse transcriptase (Life Sciences Inc., U.S.A.) and random primer DNA (prepared from calf thymus DNA) (18). Hybridization conditions were as follows: blots were prehybridized for 15 min at 60° C. in 5×Denhardt's solution containing 10 mM HEPES (pH 7.0), 0.1% SDS, 3×SSC, 10 µg/ml E. coli tRNA, and 18 µg/ml herring sperm DNA, and then hybridized (18 hr, 65° C.) to the appropriate DNA probe. Blots were washed twice for 15 min at 60° C. in 0.2×SSC containing 0.1% SDS, and exposed to x-ray film.

DNA sequencing

The pBR322 clone was digested with PstI, and the insert subcloned into the PstI site of M13 mp8 (19). Sequences were determined from the M13 ssDNA template by the chain termination method (20) using exonuclease III-treated restriction fragments (except the EcoRI/TaqI fragment) as primers (21). A synthetic primer (5'-dGGTCACAT-3'), complementary to the 3' end of the mRNA-sense strand, was also used.
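The complementarity of the synthetic primer to the 3' end of the mRNA-sense strand can be checked with a short sketch. The 3'-terminal stretch used here is copied from the sequence recited in claim 1; the function and variable names are illustrative only and are not part of the described method.

```python
# Illustrative check that the synthetic primer is the reverse complement of
# the 3' terminus of the mRNA-sense strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

PRIMER = "GGTCACAT"  # 5'-dGGTCACAT-3' from the text

# Last 33 nucleotides of the mRNA-sense strand, as listed in claim 1.
THREE_PRIME_END = "ATATAGATTTGGTCAGATTTGTATGATGTGACC"

target_site = reverse_complement(PRIMER)
anneals_at_terminus = THREE_PRIME_END.endswith(target_site)

print("primer target on mRNA-sense strand:", target_site)
print("matches the 3' terminus:", anneals_at_terminus)
```

A primer that pairs with the extreme 3' end in this way can be extended across the whole insert, which is consistent with its use alongside the restriction-fragment primers.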

Electrophoresis of rotavirus dsRNA

dsRNA was extracted from purified virus preparations using a simplified version of the method of Herring et al. (22). Briefly, 5 µl of a purified virus suspension was added to 200 µl of 0.1M sodium acetate buffer (pH 5.0) containing 1% sodium dodecyl sulphate (SDS) and vortexed for 1 min with an equal volume of phenol/chloroform mixture. The phases were separated by a brief centrifugation (2 min, 10,000 g) and an aliquot of the aqueous phase (5-20 µl) mixed with 20 µl of sample buffer (25% (v/v) glycerol, 0.2% bromphenol blue, 0.4M Tris-Cl (pH 6.8)) and analysed on a 10% polyacrylamide gel (0.75 mm thick) using the buffer system of Laemmli (23) (but without SDS). The gel was silver stained according to the method of Herring et al. (22), except that the incubation in silver nitrate was for 30 min instead of 2 hr, and sodium borohydride was omitted from the developing solution. Degassing of solutions was also found to be unnecessary.

DESCRIPTION OF THE DRAWINGS

Reference will be made to the accompanying drawings in which:

FIG. 1 shows polyacrylamide gel electrophoresis of rotavirus dsRNA extracted from A, Wa; B, Hu/5; and C, UK viruses. The eleven gene segments of Wa virus have been numbered from largest to smallest.

FIG. 2 shows Northern blot hybridizations identifying gene segment 8 of Hu/5 rotavirus as encoding the major outer shell glycoprotein. Track A shows part of the ethidium bromide-stained polyacrylamide gel of Hu/5 dsRNA (only segments 5-11 shown). The RNA bands were transferred to APT-paper and the paper cut into strips (lengthwise). The blots were hybridized to ³²P-labelled DNA probes prepared from: B, RNA segments 7, 8 and 9 of Hu/5 virus (to precisely locate these bands); C, a pBR322 clone of UK virus segment 8 (the gene encoding the major outer shell glycoprotein of this virus); and D, a pBR322 clone of the glycoprotein gene of Hu/5 virus.

FIG. 3 is a summary of the sequencing strategy used to determine the nucleotide sequence of the cloned DNA copy of dsRNA gene segment 8 of Hu/5 rotavirus. The numbers of nucleotides are indicated below the line representing the clone, and the restriction sites used to generate sequencing primers are shown immediately above (AluI, EcoRI, TaqI, BglII, HincII). A synthetic primer (5'-dGGTCACAT-3'), complementary to the 3' end of the mRNA-sense strand, was also used (primer P). The orientation of the clone is such that the mRNA-sense DNA strand is in the indicated 5' to 3' direction.

FIGS. 4, 4a and 4b show the nucleotide sequence and predicted amino-acid sequence of the mRNA-sense DNA strand of the segment 8 clone of Hu/5 rotavirus. In-phase termination codons are indicated by solid bars.

FIG. 5 is a comparison of the predicted amino-acid sequences of portions of the major outer shell glycoprotein of Hu/5 with the equivalent regions of SA11 and UK rotavirus.

DETAILED DESCRIPTION OF THE INVENTION

The rotavirus genome consists of eleven dsRNA segments which upon gel electrophoresis form a characteristic pattern of bands, the virus electropherotype (24). The gel patterns of genomic RNA from the human rotavirus Hu/5 (Hu/Australia/5/77) (13), Wa (25) (human, serotype 1) and UK (26) viruses are shown in FIG. 1, and demonstrate clearly that Hu/5 has a "short" pattern (due to the positions of segments 10 and 11) (27, 14) compared to the "long" gel patterns of the other two. The "short" pattern has previously been associated with serotype 2 human rotaviruses (27-29), and when the Hu/5 virus was serotyped in this laboratory according to the method of Thouless et al. (30) (using typing antisera kindly supplied by M. Thouless and Wa, S2 (31) and SA11 (32) viruses as serotype 1, 2 and 3 reference strains) (4, 33) it was indeed found to belong to serotype 2 (data not shown).

Hu/5 genomic dsRNA was converted into DNA and cloned into the PstI site of pBR322 as described previously for UK rotavirus (15). Clones of the major outer shell glycoprotein were identified using a probe (³²P-labelled by nick translation) prepared from a cloned glycoprotein gene from UK bovine rotavirus (11). The identity of one of these clones was confirmed by Northern blot analyses which also mapped this gene to segment 8 of Hu/5 rotavirus (FIG. 2). This clone was sequenced according to the strategy shown in FIG. 3 and the full sequence is shown in FIG. 4. The clone is a full-length copy of the glycoprotein gene since a) it is the same length (i.e. 1062 bp) as the corresponding UK and SA11 genes, and b) it has the characteristic conserved 5' and 3' terminal sequences (34, 35). It has one open reading frame (the other frames contain multiple stop codons) capable of coding for a protein of 326 amino acids, and 5' and 3' non-coding regions of 48 and 36 bp respectively. In these respects it is identical to UK and SA11 glycoprotein genes (11, 12). The base sequence homologies of the Hu/5, SA11 and UK glycoprotein genes are as follows: Hu/5:UK or SA11=74% and UK:SA11=77.6%. The three genes are therefore closely related.
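The gene layout described above can be illustrated with a minimal sketch. It translates the first and last few codons of the open reading frame, copied from the sequence recited in claim 1, using the standard genetic code, and checks that the stated lengths add up to 1062 bp. The arithmetic closes only if the 36 bp 3' non-coding region is counted from the termination codon inclusive; that reading is an assumption made here, and all names in the code are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First 13 codons and last 7 codons (including TAG) from claim 1.
ORF_START = "ATG TAT GGT ATT GAA TAT ACC ACA ATT CTG ACC ATT TTG".replace(" ", "")
ORF_END = "GCT TTT TAT TAT AGA ATT TAG".replace(" ", "")

print(translate(ORF_START))  # N-terminal residues in one-letter code
print(translate(ORF_END))    # C-terminal residues followed by the stop

# Length bookkeeping (assumption: the 36 bp 3' region includes the stop codon).
FIVE_PRIME_NONCODING = 48
CODING_RESIDUES = 326
THREE_PRIME_NONCODING = 36
total_length = FIVE_PRIME_NONCODING + 3 * CODING_RESIDUES + THREE_PRIME_NONCODING
print(total_length)
```

The translated N-terminus corresponds to the first row of the amino acid sequence recited in claim 4 (Met Tyr Gly Ile Glu ...).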

When the predicted amino-acid sequence of the Hu/5 virus glycoprotein gene was compared to those of UK and SA11 (FIG. 5) an even greater degree of similarity was observed. In pair-wise comparison the amino-acid sequence homologies are: Hu/5:UK=75.8%, Hu/5:SA11=75.2% and UK:SA11=85.6%. Studies with UK and SA11 viruses have shown that the glycosylation of these proteins is asparagine-linked and consists of simple ("high mannose") oligosaccharide moieties (36-38). Studies show that all three proteins retain a potential glycosylation site (of the type Asn-X-Ser/Thr) at residue 69, which for SA11 is the only such site. The Hu/5 and UK proteins also have potential sites at residues 238 (both), 146 (Hu/5) and 318 (UK); however, the distribution of carbohydrate in these proteins is not known.
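A search for potential glycosylation sites of the type Asn-X-Ser/Thr can be sketched as below. The two test fragments are Hu/5 residues 64-75 and 139-150 written in one-letter code, derived from the sequence recited in claim 1; the function name and the choice of fragments are illustrative, and X is treated as any residue, as in the text.

```python
import re

# Lookahead pattern so that overlapping candidate sites are all reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")

def find_sequons(protein, first_residue=1):
    """Return residue numbers of Asn in Asn-X-Ser/Thr motifs.

    first_residue is the numbering of the first letter of `protein`, so that
    fragments can be reported in full-length protein coordinates.
    """
    return [m.start() + first_residue for m in SEQUON.finditer(protein)]

# Hu/5 fragments (one-letter code) with the residue number of their first letter.
FRAGMENT_64_75 = "DAVYTNSTSGEP"    # Asp64 ... Pro75
FRAGMENT_139_150 = "VVLMRYDNTSEL"  # Val139 ... Leu150

print(find_sequons(FRAGMENT_64_75, first_residue=64))
print(find_sequons(FRAGMENT_139_150, first_residue=139))
```

For these fragments the motif is found at Asn 69 and Asn 146, in agreement with two of the Hu/5 sites named above.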

All glycoproteins of eukaryotic cells require a signal sequence for vectorial transport across the endoplasmic reticulum (39). Using the general rules proposed by Perlman and Halvorson (40) a typical signal sequence can be discerned in the first 25 residues of the three rotavirus glycoproteins. Their putative hydrophobic core sequences (res. 6-19) are preceded by the charged residue Glu- (res. 5). The likely cleavage sites are after serine at position 15, or after position 25 (Ser/Thr). Recent studies with SA11 virus (41) have demonstrated a cleaved signal sequence for this protein with a molecular weight (1,500 MW) consistent with the earlier predicted cleavage site. It is interesting that the first 25 residues of all three glycoproteins show relatively greater conservation than the subsequent 25.

While the glycoproteins of Hu/5, UK and SA11 are very similar in amino-acid sequence, they must differ in antigenically significant regions since the three viruses are serotypically different, i.e. Hu/5 is a human serotype 2 virus, UK belongs to a bovine serotype (33), and SA11 although of simian origin is serologically human type 3 (33). Results of competition experiments using monoclonal antibodies to SA11 virus have demonstrated only one or possibly two epitopes involved in neutralization (42).

To locate the major antigenic regions of the glycoprotein we have used monoclonal antibodies which neutralize SA11 rotavirus. By selecting mutants resistant to neutralization and sequencing their glycoprotein genes we were able to identify three important regions, A, B and C (M. L. Dyall-Smith, I. Lazdins, G. W. Tregear and I. H. Holmes, manuscript in preparation for publication). These are amino acids 82-103 (A), 144-155 (B) and 204-224 (C), of which region C appears to be the most important. A mutation in the C region at amino acid 211 caused a ten-fold decrease in the ability of polyclonal antiviral antiserum to neutralize virus, indicating that this is a site of major antigenic importance.

The sequence data (above) support the wealth of serological evidence (43-45) that rotaviruses are a closely related group. Indeed they appear to be much more closely related than the three serotypes of mammalian reovirus, which are structurally and epidemiologically similar to rotaviruses (46). The genes encoding the serotype-specific protein of the three reovirus serotypes are related only to the extent of 1-12% (47). The fact that two simian rotaviruses, SA11 and rhesus (MM18006), are serologically closely related (33) yet were isolated over 20 years apart (48, 49) also suggests that rotavirus serotypes are fairly stable antigenically, unlike influenza A subtypes which show antigenic drift (50). While many more rotavirus glycoprotein genes need to be studied, the limited number of human serotypes so far detected and the apparently low level of antigenic drift look encouraging for the development of human rotavirus vaccines.

Concerning vaccine preparation, in general it will be best if the rotavirus genetic material of this invention is introduced into a bacterium, and this may be effected in accordance with the procedures of Formal et al. (51), Silhavy et al. (52) or Roberts et al. (53).

REFERENCES

1. Bull. W.H.O. (1983) 61, 251-254.

2. Sato, K., Inaba, Y., Miura, Y., Tokuhisa, S. and Matumoto, M. (1982). Arch. Virol. 73, 45-50.

3. Thouless, M. E., Beards, M. and Flewett, T. H. (1982). Arch. Virol. 73, 219-230.

4. Wyatt, R. G., James, H. D., Pittman, A. L., Hoshino, Y., Greenberg, H. B., Kalica, A. R., Flores, J. and Kapikian, A. Z. (1983). J. Clin. Micro. 18, 310-317.

5. Gaul, S. K., Simpson, T. F., Woode, G. N. and Fulton, R. W. (1982). J. Clin. Micro. 16, 495-503.

6. Kalica, A. R., Greenberg, H. B., Wyatt, R. G., Flores, J., Sereno, M. M., Kapikian, A. Z. and Chanock, R. M. (1981). Virology 112, 385-390.

7. Dyall-Smith, M. L., Azad, A. A. and Holmes, I. H. (1983). J. Virol. 46, 317-320.

8. Kantharidis, P., Dyall-Smith, M. L. and Holmes, I. H. (1983). J. Virol. 48, 330-334.

9. Bastardo, J. W., McKimm-Breschkin, J. L., Sonza, S., Mercer, L. D. and Holmes, I. H. (1981). Infect. Immun. 34, 641-647.

10. Sonza, S., Breschkin, A. M. and Holmes, I. H. (1983). J. Virol. 45, 1143-1146.

11. Elleman, T. C., Hoyne, P. A., Dyall-Smith, M. L., Holmes, I. H. and Azad, A. A. (1983). Nucleic Acids Res. 11, 4689-4701.

12. Both, G. W., Mattick, J. S. and Bellamy, A. R. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 3091-3095.

13. Albert, M. J. and Bishop, R. F. (1984). J. Med. Virol. In press.

14. Dyall-Smith, M. L. and Holmes, I. H. (1981). J. Virol. 38, 1099-1103.

15. Dyall-Smith, M. L., Elleman, T. C., Hoyne, P. A., Holmes, I. H. and Azad, A. A. (1983). Nucleic Acids Res. 11, 3351-3362.

16. Rigby, P. W. S., Dieckman, M., Rhodes, C. and Berg, P. (1977). J. Mol. Biol. 113, 237-251.

17. Grunstein, M. and Hogness, D. S. (1975). Proc. Natl. Acad. Sci. U.S.A. 72, 3961-3965.

18. Taylor, J. M., Illemensee, R. and Summers, J. (1976). Biochim. Biophys. Acta 442, 324-330.

19. Messing, J., Crea, R. and Seeberg, P. H. (1981). Nucleic Acids Res. 9, 309-321.

20. Sanger, F., Nicklen, S. and Coulson, A. R. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467.

21. Smith, A. J. H. (1979). Nucleic Acids Res. 6, 831-848.

22. Herring, A. J., Inglis, N. F., Ojeh, C. K., Snodgrass, D. R. and Menzies, J. D. (1982). J. Clin. Microbiol. 16, 473-477.

23. Laemmli, U. K. (1970). Nature 227, 680-685.

24. Holmes, I. H. (1983). In "Reoviridae" (ed. W. K. Joklik) pp. 365-367 (Plenum, N.Y.).

25. Wyatt, R. G., James, W. D., Bohl, E. H., Theil, K. W., Saif, L. J., Kalica, A. R., Greenberg, H. B., Kapikian, A. Z. and Chanock, R. M. (1980). Science 207, 189-191.

26. Woode, G. N., Bridger, J. C., Hall, G. and Dennis, M. J. (1974). Res. Vet. Sci. 16, 102-105.

27. Rodger, S. M., Bishop, R. F., Birch, C., McLean, B. and Holmes, I. H. (1981). J. Clin. Microbiol. 13, 272-278.

28. Kalica, A. R., Greenberg, H. B., Espejo, R. T., Flores, J., Wyatt, R. G., Kapikian, A. Z. and Chanock, R. M. (1981). Infect. Immun. 33, 958-961.

29. Beards, G. M. (1982). Arch. Virol. 74, 65-70.

30. Thouless, M. E., Beards, G. M. and Flewett, T. H. (1982). Arch. Virol. 73, 219-230.

31. Urasawa, S., Urasawa, T. and Taniguchi, K. (1982). Infect. Immun. 38, 781-784.

32. Rodger, S. M., Schnagl, R. D. and Holmes, I. H. (1977). J. Virol. 24, 91-98.

33. Wyatt, R. G., Greenberg, H. B., James, W. D., Pittman, A. L., Kalica, A. R., Flores, J., Chanock, R. M. and Kapikian, A. Z. (1982). Infect. Immun. 37, 110-115.

34. Clarke, I. N. and McCrae, M. A. (1983). J. Gen. Virol. 64, 1877-1884.

35. Imai, M., Akatani, K., Ikegami, N. and Furuichi, Y. (1983). J. Virol. 47, 125-136.

36. Ericson, B. L., Graham, D. Y., Mason, B. B. and Estes, M. K. (1982). J. Virol. 42, 825-839.

37. Arias, C. F., Lopez, S. and Espejo, R. T. (1982). J. Virol. 41, 42-50.

38. McCrae, M. A. and Faulkner-Valle, G. P. (1981). J. Virol. 39, 490-496.

39. Kreil, G. (1981). Annu. Rev. Biochem. 50, 317-348.

40. Perlman, D. and Halvorson, H. O. (1983). J. Mol. Biol. 167, 391-409.

41. Ericson, B. L., Graham, D. Y., Mason, B. B., Hanssen, H. H. and Estes, M. K. (1983). Virology 127, 320-332.

42. Sonza, S., Breschkin, A. M. and Holmes, I. H. (1984). Virology 134, 318-327.

43. Kapikian, A. Z., Cline, W. L., Kim, H. W., Kalica, A. R., Wyatt, R. G., van Kirk, D. H., Chanock, R. M., James, H. D., Jr. and Vaughn, A. L. (1976). Proc. Soc. Exp. Biol. Med. 152, 535-539.

44. Woode, G. N., Bridger, J. C., Jones, J. M., Flewett, T. H., Bryden, A. S., Davies, H. A. and White, G. B. B. (1976). Infect. Immun. 14, 804-810.

45. Thouless, M. E., Bryden, A. S., Flewett, T. H., Woode, G. N., Bridger, J. C., Snodgrass, D. R. and Herring, J. A. (1977). Arch. Virol. 53, 287-294.

46. Joklik, W. K. (1983). In "The Reoviridae" (ed. W. K. Joklik) pp. 1-7 (Plenum, N.Y.).

47. Gaillard, R. K. and Joklik, W. K. (1982). Virology 123, 152-164.

48. Malherbe, H. H. and Stricland-Cholmley, M. (1967). Arch. Gesamte Virusforsch. 22, 235-245.

49. Stuker, G., Oshiro, L. S. and Schmidt, N. J. (1980). J. Clin. Microbiol. 11, 202-203.

50. Both, G. W., Sleigh, M. J., Cox, N. J. and Kendal, A. P. (1983). J. Virol. 48, 52-60.

51. S. B. Formal, L. S. Baron, D. J. Kopecko, O. Washington, C. Powell and C. A. Life (1981). Construction of a potential bivalent vaccine strain: introduction of Shigella sonnei form 1 antigen genes into the galE Salmonella typhi Ty 21a typhoid vaccine strain. Infect. Immun. 34, 746-750.

52. T. J. Silhavy, H. Shumn, J. Beckwith and M. Schwartz (1977). Use of gene fusions to study outer membrane protein localization in Escherichia coli. Proc. Natl. Acad. Sci. USA 74, 5411-5415.

53. T. M. Roberts, I. Birel, R. R. Yocum, D. M. Livingston and M. Ptashne (1979). Synthesis of simian virus 40 antigen in Escherichia coli. Proc. Natl. Acad. Sci. USA 76, 5596-5600.

The claims form part of the disclosure of this specification.

Modifications and adaptations may be made to the above described without departing from the spirit and scope of this invention which includes every novel feature and combination of features disclosed herein.

We claim:
 1. An isolated nucleic acid encoding the major outer capsid glycoprotein of human serotype 2 rotavirus wherein said nucleic acid comprises the nucleotide sequence:

    ______________________________________
    5'-GGCTTTAAAAACGAGAATTTCCGT CTGGCTAGCGGTTAGCTCTTTTTA  48
    ATG TAT GGT ATT GAA TAT ACC ACA ATT CTG ACC ATT TTG  87
    ATA TCT ATC ATA TTA TTG AAT TAT ATA TTA AAA ACT  123
    ATA ACT AAT ACG ATG GAC TAC ATA ATT TTC AGG TTT TTA  162
    CTA CTC ATT GCT TTA ATA TCA CCA TTT GTA AGG ACA  198
    CAA AAT TAT GGC ATG TAT TTA CCA ATA ACG GGA TCA CTA  237
    GAC GCT GTA TAT ACG AAT TCT ACT AGT GGA GAG CCA  273
    TTT TTA ACT TCG ACG CTG TGT TTA TAC TAT CCA GCA GAA  312
    GCT AAA AAT GAG ATT TCA GAT GAT GAA TGG GAA AAT  348
    ACT TTA TCA CAA TTA TTT TTA ACT AAA GGA TGG CCA ATT  387
    GGA TCA GTT TAT TTT AAA GAC TAC AAT GAT ATT AAT  423
    ACA TTT TCT GTG AAT CCA CAA CTA TAT TGT GAT TAT AAT  462
    GTA GTA TTG ATG AGA TAT GAC AAT ACA TCT GAA TTA  498
    GAT GCA TCA GAG TTA GCA GAT CTT ATA TTG AAT GAA TGG  537
    CTG TGC AAT CCT ATG GAT ATA TCG CTT TAC TAT TAT  573
    CAA CAA AGT AGC GAA TCA AAT AAA TGG ATA TCG ATG GGA  612
    ACA GAC TGC ACG GTA AAA GTT TGT CCA CTC AAT ACA  648
    CAA ACC CTA GGG ATT GGA TGC AAA ACT ACG GAT GTA AAC  687
    ACA TTT GAG ATT GTT GCG TCG TCT GAA AAA TTA GTA  723
    ATT ACT GAC GTT CTA AAT GGT GTT AAC CAT AAC ATA AAT  762
    ATT TCA ATA AAT ACG TGC ACT ATA CGC AAC TGT AAT  798
    AAA TTA GGA CCA CGA GAA AAT GTT GCT ATA ATT GAA GTT  837
    GGT GGA CCG AAC GCA TTA GAT ATC ACT GCT GAT CCA  873
    ACA ACA GTC CCA CAA GTT CAA AGA ATC ATG CGA ATA AAT  912
    TGG AAA AAA TGG TGG CAA GTA TTT TAT ACA GTA GTT  948
    GAC TAT ATT AAC CAA GTT ATA CAA GTC ATG TCC AAA CGA  987
    TCA AGA TCA TTA GAC GCA GGT GCT TTT TAT TAT AGA  1023
    ATT TAG
    ATATAGATTTGGTCAGATTTGTATGATGTGACC-3'  1062.
    ______________________________________


 2. A vector comprising the isolated nucleic acid of claim 1.

 3. A host cell comprising the vector of claim 2.

 4. An isolated major outer capsid glycoprotein which comprises the amino acid sequence:

    __________________________________________________________________________
    Met Tyr Gly Ile Glu Tyr Thr Thr Ile Leu Thr Ile Leu  13
    Ile Ser Ile Ile Leu Leu Asn Tyr Ile Leu Lys Thr  25
    Ile Thr Asn Thr Met Asp Tyr Ile Ile Phe Arg Phe Leu  38
    Leu Leu Ile Ala Leu Ile Ser Pro Phe Val Arg Thr  50
    Gln Asn Tyr Gly Met Tyr Leu Pro Ile Thr Gly Ser Leu  63
    Asp Ala Val Tyr Thr Asn Ser Thr Ser Gly Glu Pro  75
    Phe Leu Thr Ser Thr Leu Cys Leu Tyr Tyr Pro Ala Glu  88
    Ala Lys Asn Glu Ile Ser Asp Asp Glu Trp Glu Asn  100
    Thr Leu Ser Gln Leu Phe Leu Thr Lys Gly Trp Pro Ile  113
    Gly Ser Val Tyr Phe Lys Asp Tyr Asn Asp Ile Asn  125
    Thr Phe Ser Val Asn Pro Gln Leu Tyr Cys Asp Tyr Asn  138
    Val Val Leu Met Arg Tyr Asp
          Asn                                                                              Thr                                                                              Ser                                                                              Glu                                                                              Leu.sub.150                                 Asp Ala                                                                              Ser                                                                              Glu                                                                              Leu                                                                              Ala                                                                              Asp                                                                              Leu                                                                              Ile                                                                              Leu                                                                              Asn                                                                              Glu  Trp.sub.162                            Leu Cys                                                                              Asn                                                                              Pro                                                                              Met                                                                              Asp                                                                              Ile                                                                              Ser                                                                              Leu                                                                              Tyr                                                                              Tyr                            
                                                  Tyr.sub.175                                 Gln Gln                                                                              Ser                                                                              Ser                                                                              Glu                                                                              Ser                                                                              Asn                                                                              Lys                                                                              Trp                                                                              Ile                                                                              Ser                                                                              Met  Gly.sub.188                            Thr Asp                                                                              Cys                                                                              Thr                                                                              Val                                                                              Lys                                                                              Val                                                                              Cys                                                                              Pro                                                                              Leu                                                                              Asn                                                                              Thr.sub.200                                 Gln Thr                                                                              Leu                                                                              Gly                     
                                                         Ile                                                                              Gly                                                                              Cys                                                                              Lys                                                                              Thr                                                                              Thr                                                                              Asp                                                                              Val  Asn.sub.213                            Thr Phe                                                                              Glu                                                                              Ile                                                                              Val                                                                              Ala                                                                              Ser                                                                              Ser                                                                              Glu                                                                              Lys                                                                              Leu                                                                              Val.sub.225                                 Ile Thr                                                                              Asp                                                                              Val                                                                              Val                                                                              Asn                                                                              Gly                                                              
                Val                                                                              Asn                                                                              His                                                                              Asn                                                                              Ile  Asn.sub.238                            Ile Ser                                                                              Ile                                                                              Asn                                                                              Thr                                                                              Cys                                                                              Thr                                                                              Ile                                                                              Arg                                                                              Asn                                                                              Cys                                                                              Asn.sub.250                                 Lys Leu                                                                              Gly                                                                              Pro                                                                              Arg                                                                              Glu                                                                              Asn                                                                              Val                                                                              Ala                                                                              Ile                                                                              Ile                      
                                                        Gln  Val.sub.263                            Gly Gly                                                                              Pro                                                                              Asn                                                                              Ala                                                                              Leu                                                                              Asp                                                                              Ile                                                                              Thr                                                                              Ala                                                                              Asp                                                                              Pro.sub.275                                 Thr Thr                                                                              Val                                                                              Pro                                                                              Gln                                                                              Val                                                                              Gln                                                                              Arg                                                                              Ile                                                                              Met                                                                              Arg                                                                              Ile  Asn.sub.288                            Trp Lys                                                                              Lys                                                                              Trp               
                                                               Trp                                                                              Gln                                                                              Val                                                                              Phe                                                                              Tyr                                                                              Thr                                                                              Val                                                                              Val.sub.300                                 Asp Tyr                                                                              Ile                                                                              Asn                                                                              Gln                                                                              Val                                                                              Ile                                                                              Gln                                                                              Val                                                                              Met                                                                              Ser                                                                              Lys  Arg.sub.313                            Ser Arg                                                                              Ser                                                                              Leu                                                                              Asp                                                                              Ala                                                                              Ala                                                        
                      Ala                                                                              Phe                                                                              Tyr                                                                              Tyr                                                                              Arg.sub.325                                 Ile.sub.326.                                                                  __________________________________________________________________________